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Extreme value and record statistics in heavy-tailed processes 
with long-range memory 
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Abstract. Extreme events are an important theme in various areas of science because 
of their typically devastating effects on society and their scientific complexities. The lat- 
ter is particularly true if the underlying dynamics does not lead to independent extreme 
events as often observed in natural systems. Here, we focus on this case and consider 
stationary stochastic processes that are characterized by long-range memory and heavy- 
tailed distributions, often called fractional Levy noise. While the size distribution of ex- 
treme events is not affected by the long-range memory in the asymptotic limit and re- 
mains a Frechet distribution, there are strong finite-size effects if the memory leads to 
persistence in the underlying dynamics. Moreover, we show that this persistence is also 
present in the extreme events, which allows one to make a time-dependent hazard as- 
sessment of future extreme events based on events observed in the past. This has direct 
applications in the field of space weather as we discuss specifically for the case of the 
solar power infiux into the magnetosphere. Finally, we show how the statistics of records, 
or record-breaking extreme events, is affected by the presence of long-range memory. 



1. Introduction 

History often turns on extreme events — rare occurrences 
of extraordinary nature — be they man-made or natural 
Examples are global financial crises, military strikes, radical 
political events, or natural disasters such as floods, droughts 
and earthquakes. Extreme events are often associated with 
catastrophes, and the word 'extreme' is sometimes substi- 
tuted by 'freak' to suggest something unnatural and unde- 
sirable. Generally, the economic and social consequences of 
extreme events are a matter of enormous concern. In par- 
ticular, the ever increasing economic and human losses from 
natural hazards underscore the urgency for improved under- 
standing of extreme events to develop effective strategies to 
reduce their impact. 

The surprisingly high likelihood of extreme events is ac- 
tually a key attribute of many complex systems, in both 
natural and man-made environments (see, e.g., [Albeverio 
et al, 2006; Bunde et ai, 2002; Embrechts et al, 2004; 
Galambos et al., 1994; Sornette, 2006]). In particular, we 
know that records must be broken in the future, so if a 
flood design is based on the worst case of the past, then 
we are still not prepared for all possible floods in the fu- 
ture. The classic approach to studying the probability of 
extreme events has been to assume independent and identi- 
cally distributed (iid) event sizes. This has led to a powerful 
statistical theory (see, e.g., [Embrechts et al., 2004; Coles, 
2007; de Haan and Ferreira, 2006]), which has been suc- 
cessfully applied in many cases (see, e.g., [Easterling et al, 
2000; Glaser and Stangl, 2004; van den Brink and Konnen, 
2008]). The latter is also related to the fact that some parts 
of the theory — for example, the limit distributions of block 
maxima — can be extended to a wide class of dependent 
stationary stochastic processes and their associated time se- 
ries [Berman, 1964; Leadbetter et al, 1983; Leadbetter and 
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Rootzen, 1988; Samorodnitsky , 2004]. However, we are still 
far away from a general understanding of extreme events 
generated by such dependent processes, which are abundant 
in nature. Complicating factors typically include slow con- 
vergence and strong finite-size effects [Gydrgyt et al., 2008], 
as well as nonlinear correlations [Bogachev et al, 2007]. 

Strikingly, one often encounters dependence in the form 
of long-range persistence in recordings taken from natural 
systems. Persistence is defined as the tendency that subse- 
quent values in a time series are similar: large values tend 
to be followed by large values and small values tend to be 
followed by small values. Such behavior has been reported 
for water levels in rivers [Hurst, 1951] and in river runoff 
records [Kantelhardt et al., 2003, 2006], in climatological 
temperature recordings [Kosctelny-Bunde et al., 1998; Pel- 
letier and Turcotte, 1997; Huybers and Curry, 2006] and for 
temperature fluctuations in oceans [Monetti et al, 2002], as 
well as in marine data [Roman et al., 2008], to name only a 
few examples. 

The theoretical studies of the extreme value statistics of 
stationary stochastic processes exhibiting long-range persis- 
tence have mainly focused on processes with Gaussian dis- 
tributed event sizes. In this and related cases, the limit 
distribution of the block maxima is the same as in the 
iid case — a Gumbel distribution [Berman, 1964; Eichner 
et al, 2006]. Similar results have been obtained for beta- 
distributed random variables, in which case the Gumbel dis- 
tribution is replaced by the WeibuU distribution [Moloney 
and Davidsen, 2009]. Meanwhile, for distributions with 
power-law tails the block maxima for iid events are Frechet 
distributed [Embrechts et al., 2004], and this is even true 
for a certain class of dependent sequences [Leadbetter et al., 
1983; Leadbetter and Rootzen, 1988]. Power-law tails imply 
that there is no "typical" event size and that indeed event 
sizes vary over many orders of magnitude. Many natural 
systems even obey distributions of Pareto-like (power-law- 
like) heavy tails, such that the second moment of the dis- 
tribution is not defined. In space physics, such heavy-tailed 
distributions have been used to model e.g. magnetic held 
line transport [Pommois et al, 1998], fluctuations in vari- 
ous solar wind parameters [Hnat et al., 2003; Bruno et al., 
2004; Zaslavsky et al., 2008] and auroral indices [Watkms 
et al., 2005; Zaslavsky et al., 2008]. Other examples include, 
size distributions in geology [Caers et al., 1999], meteorology 
[Taqqu, 1987], or sediments [Painter and Paterson, 1994]. 
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Here, we study the extreme value statistics of exactly 
those stationary stochastic processes that are characterized 
by persistent (or anti-persistent) long-range memory and 
heavy tails. These processes are sometimes called fractional 
Levy noise or linear stable fractional noise — see [Watkins 
et at, 2009; Moloney and Davidsen, 2010] for discussions 
and further references. Specifically, we consider those sym- 
metric a-stable distributed processes (0 < a < 2) that are 
characterized by a self-similarity index H with < H < 1. 
These are particularly relevant in the context of the solar 
wind as discussed, for example, by Moloney and Davidsen 
[2010] for the energy influx into the magnetosphere, which 
is captured by the Akasofu e parameter. While it is known 
that the limit distribution of the block maxima remains a 
Frcchot distribution for all a and H [Sarnorodnitsky , 2004], 
wo systematically quantify the finite size corrections. As 
expected, they are particularly pronounced for strong per- 
sistence but they also play a role in the presence of anti- 
pcrsistencc. Moreover, we show that the conditional block 
maxima distributions deviate significantly from the uncondi- 
tional distribution for small block sizes. This history depen- 
dence allows one to make a time-dependent hazard assess- 
ment of the value of the next block maximum based on the 
previous one. We also show how the statistics of records, or 
record-breaking extreme events, is affected by the presence 
of long-range memory and, thus, extend very recent results 
for the Gaussian distributed case [Newman et al., 2010). Fi- 
nally, we apply the time-dependent hazard assessment and 
the record analysis to the e time series derived from ACE 
spacecraft measurements for the years 2000 — 2007. 

The paper is organized as follows: First we review basic 
properties and limit theorems of (maximum) extreme value 
statistics in Sect. 2, followed by record statistics in Sect. 3. 
Wc then describe a-stable- or Levy distributions and discuss 
the generation of stationary symmetric a-stable distributed 
processes with long-range memory in Sect. 4. The pre- 
sentation of our rmmerical results can bo subdivided into 
two parts. While we focus on estimating the extreme value 
statistics of these processes including simple block maxima 
and conditional block maxima in Sects. 5.1 and 5.2, their 
record statistics is presented in Sect. 5.3. Section 6 dis- 
cusses the application of the introduced methodology to the 
solar power influx into the Earth's magnetosphere. Finally 
we conclude in Sect. 7. 

2. Classical Extreme Value Statistics 

In classical extreme value statistics one considers sets 
of independent identically distributed random variables 
{Xi, X2, . . . , X„}, each drawn from the same cumulative 
distribution function F{x). F(x) can typically be repre- 
sented by a unique probability density function P{x). A 
time series {a;i}i=i,...,n can then be understood as one pos- 
sible realization of the set of random variables, {X\ = 
xi,X2 — X2, ■ ■ ■ ,Xn = Xn}, foUowiug this density function 
P{x). The distribution of the (block) maximum or extreme 
value M„ = maxjXi , . . . , X„} ^ is given by 



Pr(M„ <m) = Pr{Xi <m,...,Xn<m) 

n 

= Y[ Pr{X, <m) = F"{m) 



(1) 



The Fischer-Tipett-Gnedenko theorem states that if there 
exists a ronormalization sequence {a„,6„},o„ € IR"*" and 
hn € H, such that 



lim Pr 



< m 



: G(m) , 



(2) 



where G{m) is non-degenerate and d. means convergence 
in distribution, then the asymptotic limit distribution G(m) 
belongs to one of the following three function families [Coles, 
2007]: 



'Gumbei('Ti) = exp |- exp |- {^—^—^ 1 1 

GF.,chet(m) = |°^p|_^^^_.j 



: m < b 
: m > b 



m < b 
: m >b. 



(3a) 
(3b) 
(3c) 



Here, a € R"'", 6 € IR and a > 0. As evident from Eq. (1), 
the exact limiting distribution is determined by F{x) and, 
thus, P{x). 

If, for instance, P{x) corresponds to the normal distribu- 
tion or the exponential distribution, the limit distribution 
of the (block) maxima or extreme values follows a Gumbel 
distribution (Eq. (3a)). In cases where P{x) exhibits power 
law right tails (Pareto-like tails), i. e., P{x) ~ L{x)x~^~°' 
where P(x) is the tail of the distribution P{x), L(x) is some 
slowly varying function, the limit distribution of the (block) 
maximum or extreme value approaches the Prechet distri- 
bution (Eq. (3b)) with the same a > 0. Finally, the limit 
distribution of the maximum falls into the Weibull class for 
some distributions with bounded right tails, such as the beta 
distribution (Eq. (3c)). Note that the convergence to one of 
the three distributions in Eq. (3) requires the condition in 
Eq. (2) to be valid. For example, the condition in Eq. (2) 
is not generally satisfied for the Poisson distribution, the 
geometric distribution, or the negative binomial distribu- 
tion [de Haan and Ferreira, 2006; Emhrechts et al., 1997]. 

All three extreme value distributions in Eqs. (3) can be 
written in a compact form in terms of the generalized ex- 
treme value distribution (GEV) 



G(.) = exp{-[n-^(i-ii)]; 



-l/€ 



(4) 



where -I- indicates the constraint [l + ^ i^^Y] > 0- The 
parameters are the shape parameter ^, ^ e R, the scale pa- 
rameter a, a € IR"*", and the location parameter /j, /j G IR. 
The shape parameter ^ determines the distribution family: 
^ = (interpreted as limj^o) indicates the Gumbel class 
(Eq. (3a)), ^ > the Frechet class (Eq. (3b)), and ^ < 
the WeibuU class (Eq. (3c)). 

In this paper, we exclusively focus on symmetric a-stable 
distributed processes with long-range memory, for which the 
block maxima asymptotically converge in distribution to the 
Frechet extreme value distribution in Eq. (3b) [Sarnorodnit- 
sky, 2004]. 

3. Classical Record Statistics 

The field of record statistics is closely related to extreme 
value statistics since records can be considered as a spe- 
cial type of extreme values: Records are simply record- 
breaking extreme values, i.e., the sequence of extreme val- 
ues for increasing n. Studies of record values and record 
times started with the pioneering work by Chandler [1952] 
followed by important contributions by Reny [1962]. For 
a comprehensive overview and a historic summary we re- 
fer to [GUck, 1978] and [Nevzorov, 2001] and references 
therein. The classical theory of records is based on the 
assumption of iid random variables and has been success- 
fully extended and applied to many natural systems, includ- 
ing earthquakes [Davidsen et al., 2006, 2008; Van Aalsburg 
et al., 2010; Vasudevan et al., 2010; Peixoto et al., 2010], 
climate dynamics [Schmittmann and Zia, 1999; Benestad, 
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2003; Redner and Petersen, 2006; Benestad, 2008; Newman 
et at, 2010; Wergen and Krug, 2010], hydrology [Vogel et al., 
2001] and evolution [Sibani and Jensen, 2009]. 

Specifically, let again {Xi, X2, ■ ■ ■ , Xr} be a sequence of 
iid random variables with a common continuous distribution 
F(x). One possible realization of the set of random vari- 
ables may then be given by the time series {xi}i=i = 

{xi,X2, ■ ■ ■ ,xr}. We denote Xm as an (upper) record if 
Xm = maux{Xi, . . . , Xm} for m < R ^ . As illustrated in 
Fig. 1, the set of records for a given realization is trivially 
ordered and the index m at which the fc-th record occurs is 
defined as the fc-th record time, Tk. Obviously, Ti = 1 and 
7fe>2 ~ inin{j < R : Xj > Xt^ -^}. Thus, Tfc>i as well as 
the number of records after a time t < R in a block of size 
7?, Nt, are random variables themselves. 

Interestingly, many statistical properties of records 
are identical for all sequences of iid random variables 
{Xi}i=i,...,_R and, thus, independent of the distribution 
F{x). This includes the expected number of records and 
their variance. To see this, it is important to realize that 
each Xfe<^ has the same probability of being the maximum, 
which is simply given by 

P(Xfc = max{Xi, . . . , Xm}) = 1/ni . (5) 
Thus, this is in particular the probability that X,n is 
a record, which allows one immediately to calculate the 
expected number of records after a time t and their vari- 
ance [Glick, 1978]: 



ni=l 

t 

= ^ l/m = ln(t) 7 0{l/t) (6a) 

m=l 

Var(iV,) = E - - E A 



are stable [Samorodmtsky and Taqqu, 1994]. In particular, 



the constants Cn are not arbitrary but scale 



with 



= ln(f)-h7 + C(lA)~'r76 



(6b) 



where 7 = 0.577 ... is the Euler-Mascheroni constant. 

Note, that there are other properties of records, as for 
instance the distribution of occurrence times of the fc-th 
records [GUck, 1978], that are also independent of the distri- 
bution of {Xi}i=i,,.,,_R but will not be further discussed here. 
In contrast to the above distribution-independent proper- 
ties, results involving actual record values do generally de- 
pendent on the distribution F{x) [Nevzorov, 2001; Redner 
and Petersen, 2006]. 

4. Heavy-tailed processes with long-range 
memory 

A prototype of stationary stochastic processes with 
heavy-tailed distributions are a-stable processes. A random 
variable X is called stable if any set of independent copies 
of X, {Xi,. . . , Xn}, obeys 

Xl+... + Xn=C„X + dn (7) 

for some c„ £ E"*" and d G IR [Embrechts and Schmidli, 
1994]. This property implies that the shape of the distri- 
bution remains invariant if sums over the random variable 
are considered. If dn = 0, X is referred to as strictly stable, 

and X is symmetric ii X = —X. A generalized central limit 
theorem proves that all distributions that have a probability 
density function P{x) with an associated Fourier transform 
or characteristic function of the form 

'exp ^itS - 7"|f|" (^1 - il3 sgn(t) tan ( — )) } 

exp j it& - 7|t| [ 1 sgn(t) In \t\ j \ 

(8) 



a £ (0,2], and hence, the corresponding distributions are 
called a-stable (aS) distributions. In Eq. (8), the charac- 
teristic exponent or index of stability a G (0, 2], the skewness 
parameter /3 G [— 1, 1], and the scale parameter 7 G R"^ de- 
scribe the shape of the distribution while 5 G IR is the loca- 
tion parameter. Since many of these aS distributions do not 
posses a finite mean and standard deviation, these quantities 
are typically not used to characterize these distributions. 

For symmetric a-stable (SaS) distributions, the skewness 
parameter obeys /3 = 0. In particular, the Gaussian distri- 
bution M(n, o-^) is SaS with (a = 2, ^ = 0, 7 = a/\/2, 5 = 
/i). Throughout this paper we will only focus on SaS dis- 
tributed time series and set without loss of generality the 
location parameter 5 to zero. We will further normalize our 
time series to en.sure 7 = 1 (see Sect. 4.2). Hence, Eq. (8) 
reduces to the simple form 



^„(t)=exp{-|tr} 



(9) 



While even in this case an analytical form of P{x) is typically 
not known, the asymptotic behavior for a G_(0, 2) is char- 
acterized by Pareto-like power-law tails with P{x) ~ x~^~" 
[Somette, 2006]. Consequently, the variance of all non- 
Gaussian aS distributions with a < 2 is undefined. In addi- 
tion, the distribution's mean value is undefined for a < 1. 

4.1. Long-range memory in symmetric a-stable 
processes 

For a sequence of identically but not independently dis- 
tributed random variables {Xi , X2 , . . . , Xn} with a common 

continuous distribution F{x), the associated realizations or 
stationary time series {a;i}i=i,...,„ — {x\,X2, ■ ■ ■ ,Xn} are 
typically characterized by non-trivial correlations. One par- 
ticular example is the presence of long-range linear auto- 
correlations. These can be described by three equivalent 
definitions, namely a power-law behavior in (i) the auto- 
correlation function with correlation exponent < 7 < 1, 
C(s) ^ s^'' , (ii) the power spectrum with spectral ex- 
ponent < /3 < 1, P{f) ~ as well as (iii) the 
fluctuation function with (monofractal) fluctuation expo- 
nent h(2), F{s) ~ s**'^' for scales s, frequencies /, and 
1-/3 = 7 = 2 — 2/1(2); for a comprehensive definition and 
discussion, see [Schumann, 2011] and references therein. 

While such long-range linear auto-correlations imply 
long-range persistence, the reverse is not necessarily true. In 
particular, the above definitions require that a finite variance 
exists which is not the case for a-stable processes with a < 2 
as discussed above. Thus, one has to use an alternative way 
to quantify persistence and long-range memory in general 
for such processes. One possibility is to define persistence 
based on the closely related phenomenon of self-similarity. A 
(continuous) stochastic process X = {X{t),t G IR} is called 
self-similar with a self-similarity parameter H > if V/t > 0, 
t G IR, [Embrechts and Makoto, 2002], 



{X{Kt)} ^ {n^'Xit)} . 



(10) 



It can be further shown that for self-similar SaS distributed 
■f#bi^ises XH,a with < ii" < 1, long-range persistence is 
achieved for H > 1/a, while H = 1/a corresponds to the 
uncorrelated case and anti-persistence emerges for H < 1/a 
[Stoev and Taqqu, 2004]. Thus, persistent behavior can only 
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be observed if 1 < a < 2 and we focus on this range of a 
values in the following. 

4.2. Numerical simulation of self-similar symmetric 

ct-stable processes 

There arc two distinct algorithms typically used to 
generate stationary self-similar SaS-distributed processes 
characterized by the exponents a and H. One ap- 
proach utilizes fractional calculus and derives the frac- 
tional integral or fractional derivative X^^a{t) of an un- 
correlated SaS-process Xait) in Fourier space, Xaiui) = 
{2n)~^ fj^Xa{t') ex.p{—it'oj}dt' , [Chechkin and Gonchar, 
2000], 

= ^^iM, „ = H-l/a. (11) 

Hence, this approach is very similar to the established 
Fourier-filtering technique often used for generating corre- 
lated Gaussian noises, see e. g. [Schumann and Kantelhardt, 
2011]. Note that Chechkin and Gonchar [2000] restrict their 
algorithm to 1 < a < 2 '\ They further discuss why 
high-frequency inaccuracies of fractional integration disturb 
numerical simulations in the anti-persistent case, see also 
[Ghechkin and Gonchar, 2001] for their comprehensive study 
of fractional Gaussian integration. 

The other algorithm is based on discretizing and rmmer- 
ically solving a stochastic integral that defines a cumulative 
self-similar SaS process YH,a{t), often called fractional Levy 
motion, 

(12) 

where Ya{s) is (standard) SaS motion and Cu^a is some 
norming constant. The increments Xn.aik) = YH,a(k) — 
YH,a{k— 1) in the discretized version of Eq. (12) then define 
a self-similar SaS process Xh.u that we study here. Note 
that the norming constant is chosen such that (i) the SaS 
scaling parameter is 7 = 1 as previously discussed for de- 
riving Eq. (9) from Eq. (8) and (ii) the probability density 
functions are equal for the independent and self-similar SaS 
process. A stochastic integral such as in Eq. (12) can be 
discretized and efficiently be solved by exploiting the convo- 
lution theorem and employing a fast Fourier transform, for 
further details we refer to [Stoev and Taqqu, 2004] 

In the following, we present results for self-similar SaS 
processes using the latter algorithm [Stoev and Taqqu, 
2004j but we obtain very similar results using the former 
one [Chechkin and Gonchar, 2000] over the range of its es- 
tablished validity. These processes are parametrized by the 
two parameters H and a (7 = 1, 5 = /3 = in Eqs. (8) and 
(9)). Two further parameters m and M are required by the 
used algorithm. They control the mesh-size (intermediate 
points) and the kernel of the fast Fourier transform, respec- 
tively. We chose the parameters m = 64 and M = 48576 
and generate time series of length n = 1000000. This choice 
of m and M with mM <C n is motivated by our observar- 
tion that an undesired crossover in the scaling behavior of 
different parameters appears at scales of the order mM ®. 
To the best of our knowledge, this has not been noticed be- 
fore. Indeed, much shorter time series with n + M = 2^'^, 
rrimax = 256, and Mmax = 6000 were studied by [Stoev and 
Taqqu, 2004] who concluded that larger values of m should 
be chosen for < a < 1 while smaller m are preferable 
for 1 < a < 2 and small M. This is clearly not supported 
by our simulations for much larger values of n ^, which are 
necessary to estimate extreme value and record statistics of 
self-similar SaS processes reliably . 

For each parameter set (a, H) we generate iVconf = 1500 
long-range persistent data sets For surrogate testing 
we destroy correlations by shuffling those data sets using 
a Mersenne- Twister- 19937 pseudo-random number genera^ 
tor [Matsumoto and Nishimura, 1998] which was also used 
to generate white Gaussian noise being the basis for the un- 
correlated SaS noise ^ . 



5. Effects of long-range memory on the 
statistical properties of symmetric a-stable 
processes 

5.1. Extreme value statistics 

Since the theoretical analysis of extreme value statistics is 
mostly concerned with asymptotic distributions as discussed 
in section 2, these findings are not immediately applicable to 
time series {a;i}i=i,...,„ of finite length n which are measured 
in natural systems. In particular, the exact cumulative dis- 
tribution function F{x) is generally unknown and, hence, 
neither is F"{m) in Eq. (1). Thus, one typically has to as- 
sume that i) the given time series is a single realization of an 
underlying stationary stochastic process and ii) the underly- 
ing dynamics is ergodic such that one can use time averages 
instead of or in addition to ensemble averages. This allows 
one to estimate the distribution of the (block) maximum or 
extreme value A^r = max{Xi , . . . , Xr} from the given time 
series by considering non-overlapping blocks of size R <^ n. 
To be more specific, the sequence of maximum values in dis- 
junct blocks of length R is given by {mj}j=i^..._[„/fl] where 
m,j — max{a;(j_i)fl+i, a;(j_i)fl+2i • • • ,XjR} is the maximum 
in block j and [.] denotes the integer division; see Fig. 2 for 
an illustration. Under the above assumptions, the estimated 
distribution PR{m) of the block maxima should converge 
towards a specific G(m) as 7? 00 if a suitable sequence 
{unjbn} exists as in Eq. (2). If this convergence is suffi- 
ciently fast, one can estimate G(Tri) reliably even for finite 
R (and n) and a finite number of realizations A'conf provided 
that [n/R] x iVconf > 1- 

For independent symmetric a-stable distributed pro- 
cesses, it is well-known that the asymptotic limit distri- 
bution of its extreme values is the Frechet distribution in 
Eq. (3b) [Embrechts et al, 1997] since P(a;) ~ a;"^"". More 
recently, it was proved mathematically that all self-similar 
SaS processes with a < 2 also converge in distribution to the 
Frechet extreme value distribution [Samorodnitsky, 2004]. 
Thus, the presence of long-range memory in the form of 
persistence or anti-persistence does not have any significant 
influence on the asymptotic behavior in this case of station- 
ary processes. However, we find that there arc significant 
finite size effects, i.e. significant differences in the extreme 
value distribution for finite R between the cases with and 
without long-range memory. 

To see this, we estimate the probability density function 
PR{rn) of the maxima or extreme values in non-overlapping 
blocks of length R which are shown in Figures 3 and 4 for 
self-similar SaS processes with different parameters H and 
a including persistent as well as anti-persistent cases. There 
are clear dift'crcnccs between the original data and the in- 
dependent surrogate data, both of which were generated as 
described in section 4.2. These differences are particularly 
pronounced for small values of m and R and for strongly 
persistent cases [H — 0.9). For small m, we observe that 
persistence and anti-persistence lead to different behaviors: 
the probability of finding a smaller m is enhanced in the case 
of persistence when compared to the surrogate data, while 
in the case of anti-persistence the behavior depends sensi- 
tively on R. Nevertheless, in the latter case there seems to 
be a clear and rapid convergence to the distribution of the 
independent surrogate data with increasing R. Specifically, 
there are no significant differences for R > 1000. This ob- 
servation can be readily explained. Anti-persistence implies 
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that in an associated time scries large values tend to be fol- 
lowed by small values and small values are rather followed 
by large values which is similar to an alternating behavior. 
This alternating behavior does not significantly constrain 
the maximum value found in a given block if its size is suffi- 
ciently large. Thus, long-range memory in the form of anti- 
persistence does not play an important role for the extreme 
value distribution even for finite R. 

The situation is very diflferent for the case with long-range 
persistence. Since persistence corresponds to a clustering of 
large and small values, respectively, more blocks will contain 
only smaller values such that the block maximum will also be 
smaller than what would be expected based on independent 
values. Thus, the distribution of block maxima will indicate 
a higher probability to find smaller values compared to the 
independent surrogate data. This is exactly what Figures 3 
and 4 show for H = 0.9, independent of R. As expected, the 
deviations from the extreme value distribution of the surro- 
gate data are larger for stronger persistence as a comparison 
of a = 1.5 and a = 1.8 shows. Thus, long-range memory in 
the form of persistence reduces the probability of observing 
very large block maxima for finite R. This is quantified by 
Fig. 5, which shows the exceedance probability defined as 

oo 

ER{m) = Pr{MR > m) = ( PR{m')dm' . (13) 



This phenomenon is not only true for self-similar SaS 
processes considered here but also for stationary long- 
range correlated Gaussian and exponentially distributed 
processes [Eichner et al, 2006]. 

In order to test how closely the probability density func- 
tion Pii{m) of the block maxima for finite R resembles its 
asymptotic limit given by the generalized extreme value dis- 
tribution in Eq. (4), we employ a maximum likelihood es- 
timation method - already implemented in Matlab's Statis- 
tics Toolbox - to estimate its parameters and to quantify the 
goodness-of-fit [Aldrich, 1997; MathWorks Matlab, R2009b]. 
The values of the shape parameters ^ determining the dis- 
tribution class are reported for all considered values of H 
and a in Table 1 in the Appendix. Fig. 6 illustrates the 
convergence of to 1/a as R ^ oo for selected values. 
The fits themselves are shown in Figures 3 and 4. While 
there are significant deviations from the GEV distribution 
for small R and strong persistence, we find relative agree- 
ment for R > 10000 in all cases considered. However, we 
observe systematic deviations from the theoretical value 1 /a 
for very large R. Although errors increase, the systematic 
underestimation of the shape parameter is possibly not ex- 
clusively caused by finite size effects but might be due to 
short-comings in the GEV fitting algorithm as well as the 
data generation algorithm. 

Further evidence for the rapid convergence to the asymp- 
totic GEV distribution comes from the scaling of the mean, 
(m)j{, the median, medR, and the estimated scale param- 
eter, aR, with R (see Eq. (4)), which is shown in Fig. 7. 
For large R, the scaling approaches the same limit for the 
self-similar SaS processes and the independent surrogate 
data For the rescaling coefficient ajj in Eq. (2), it is 
known that a_R cx R^^" asymptotically [Embrechts et al, 
1997]. This follows from studies of the domain of attraction 
of the Frcchet class, which have proved that if the cumulative 
distribution function F(x) of the stochastic process decays 
asymptotically as F{x) ^ L(x)x^" , then the rescaling co- 
efficient aR = R^^"L(R) for some slowly varying function 
L [Embrechts et al., 1997]. L{R) = constant for SaS pro- 
cesses. It was also shown that centering is not necessary, 
i. e., 6j{ = [Embrechts et oZ., 2004). 

Nevertheless, there exist to our best knowledge no ana- 
lytical derivations on the convergence of (m) r, medR, and 
aR. We expect a similar behavior in the limit Ji — >^ oo where 



a unique Frcchet distribution is approached. While we ob- 
serve a scaling similar to ur for the median in Fig. 7(b) 
and with reservations for the scale aR in Fig. 7(c) there are 
still noticeable deviations for the mean {m)R, and for that 
reason we exclusively focus on medians in our discussion of 
conditional extremes, see Sect. 5.2. Besides this, Fig. 7 
provides further evidence that the asymptotic scaling of the 
renormalization sequence {a„, hn} — which ensures the con- 
vergence in Eq. (2) — is not affected by the presence of long- 
range rricrriory. Yet, we also see clear deviations from the 
asymptotic scaling for small R as expected. In particular, 
we observe a crossover in the medians at scales somewhat 
below i?x = 100 for a = 1.5 and around i?x = 600 - 700 
for a = 1.8 in the medians. Note that this crossover is ap- 
parently independent of H. This indicates that long-range 
memory does not play a role for the deviations from the 
asymptotic scaling. Instead, the 'heavier' the heavy-tailed 
distribution, the smaller _Rx • 

Based on the observed scaling behavior, it is straightfor- 
ward to identify a suitable sequence {aR, to ensure that 
Eq. (2) holds. Namely, we chose 

aR = i?^/" and 6j{ = . (14) 

Using this sequence, we can rescale the probability den- 
sity functions in Figs. 3 and 4 according to Eq. (2) in order 
to obtain a scaling collapse resembling the asymptotic dis- 
tribution. This is shown in Figs. 8 and 9. While the collapse 
in the tails is excellent in all considered cases, the collapse 
is often not as good for small arguments. This is particu- 
larly true for large a and large H and confirms the findings 
above. 

5.2. Conditional extrema and hazard assessment 

As a consequence of long-range memory in self-similar 
SaS processes, the value of X{t) for a given t depends on the 
values at all earlier times. Thus, the probability of finding a 
certain value mj+i as the block maximum also depends on 
the history and, thus, the preceding block maxima mu with 
k < j. This directly allows one to make a time-dependent 
hazard assessment and potentially even predict the size of 
future extreme values based on the history of block max- 
ima. In the absence of memory as for the iid case, only 
time-independent hazard assessment is possible. 

As one possible approach to time-dependent hazard as- 
sessment, we study the statistics of all block maxima rrij+i in 
the sequence of maxima {»tIj}j=i,... jat/h] that follow a block 
maximum rrij = mo within a certain range — see [Schwei- 
gler and Davidsen, 2011] for a related but different approach. 
For example, Figs. 10(a,b) display the conditional probabil- 
ity density functions of block maxima m that follow a block 
maximum mo larger or equal than a threshold. The thresh- 
olds were chosen to obtain 15000 conditional maxima when 
combining all realizations As can be seen, long-range 
persistence shifts the distributions to the right compared to 
the independent surrogate case. The latter is equivalent to 
the iid case. Note that the results for the inverse condition 
(mo is smaller than the threshold) are also indistinguishable 
from the iid case. Panels (c,d) report the corresponding 
exceedance probabilities, clearly indicating clustering of ex- 
tremes in the presence of long-range persistence compared 
with the time-independent case, i. e., larger events are more 
likely followed by large events in the time-dependent case. 

For a more complete understanding we now define the 
conditions mo as the geometric averages of non-overlapping 
and exponentially growing ranges and combine bins at both 
ends to ensure at least 1000 conditional maxima per bin^^. 
Figures 11 and 12 report the full results for conditional medi- 
ans medR{mk,o) = median{mjj=2,...,n|mj-i = mk,o} where 
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the conditions rnk.o arc understood as ranges defined by 
the k-th bill. In addition to the conditional median of self- 
similar SaS distributed time series we also show the condi- 
tional medians for the randomized sequence of block maxima 
for comparison. Shuffling the raw data affects the distribu- 
tion of block majcima depending on the block size R. Hence, 
for generating independent surrogates it is here important 
to shuffle the sequence of block maxima and not the under- 
lying time series {xi}i to preserve the distribution of block 
maxima. Note that shuffling the block maxima is equivalent 
to destroying long-range memory in {xi} on scales larger 
than the block size but preserving memory on smaller scales 
[Schumann and Kantelhardt, 2011]. Considering the surro- 
gates corresponds to a time- independent hazard assessment. 
We observe that medBi'mo) increases monotonically with mo 
in the case of strong persistence, again a clear indication of 
the clustering of block maxima. This dependence becomes 
less pronounced for larger R and indeed it should vanish in 
the limit i? — ^ oo. This is also true for other processes with 
long-range persistence [Eichner et al., 2006]. In the case of 
anti-persistence, the deviations from the expected behavior 
for independent maxima are much smaller and only signifi- 
cant for small R. As expected, the behavior is opposite to 
the case of persistence: medij(mo) decreases monotonically 
with mo. 

5.3. Record Statistics 

While the theoretical results for record statistics pre- 
sented in section 3 are all based on the assumption of iid 
processes, not much is known about the case of stationary 

processes with long-range memory [Newman et al, 2010]. 
For self-similar SqS processes, we first analyze the proba- 
bility P{Nr) to find Nb. records given a sequence of length 
R. As Fig. 13 shows for R = 10000, this probability only 
deviates significantly fioiii the iid prediction which is in- 
dependent of the underlying distribution as follows from the 
discussion in section 3 — in the case of strong persistence 
{H > 1/a). Since persistence implies that large values tend 
to follow large values and small values rather follow small 
values, one expects that there is higher probability to en- 
counter a small number of records as well a larger number 
of records compared to the iid prediction. This is exactly 
what Fig. 13 shows. While the above results could be spe- 
cific to the chosen R, this is not confirmed by the dependence 
of the mean number of records E{Nt) on t with t < R. In- 
deed, Fig. 14 indicates that strong deviations from the iid 
behavior described by Eq. (6a) only occur in the presence 
of strong persistence. While the infiuencc of persistence on 
record statistics is easily detectable in E{Nt) and P(Nii), 
the effect is more subtle if one considers the probability to 
break a record at a time t, P{Xt = max{a;i, . . . ,xt}), see 
Fig. 14. This is related to the fact, that the former are cu- 
mulative measures of the later as follows directly from the 
discussion in section 3. 

Our results are consistent with earlier numerical findings 
for stationary Gaussian processes with persistent long-range 
memory [Newman et al, 2010]. The authors found that 
the expected number of records E{Nt) for fixed t increased 
monotonically with increasing persistence. 

6. Application: Solar power input into the 
Earth's magnetosphere 

The solar wind is a prime example of plasma turbulence at 
low frequency magnetohydrodynamic scales as evident from 
its power-law energy distribution [Zhou et al., 2004], the 
magnetic field correlations [Matthaeus et al., 2005] and its in- 
termittent dynamics [Burlaga, 2001]. While understanding 
this type of turbulence is an important challenge by itself, 
understanding the interplay between the solar wind and the 



Earth's magnetosphere is another open problem of consider- 
able interest [Baker, 2000]. The difficulty of understanding 
the magnetospheric response to the solar wind variations 
is intimately related to the observed range of mechanisms 
of energy release and multiscale coupling phenomena [An- 
gelopoulos et al., 1999; Lui et al., 2000; Unitsky et al., 2002; 
Borovsky and Funsten, 2003; D'Amicis et al., 2007; Urit- 
sky et al., 1998; Chang, 1999; Chapman et al., 1999; Klimas 
et al., 2000; Sitnov et al, 2001; Consolini, 2002; Aachwan- 
den, 2011]. Here, we focus on the Akasofu e parameter which 
is a solar-wind proxy for the energy input into the Earth's 
magnetosphere (see [Koskinen and Tanskanen, 2002] for a 
more recent discussion). In SI units it is defined as 

t = v—llsin{e/2), (15) 

where v is the solar wind velocity, B is the magnetic field, 
Ho = A-K X 10^^ is the permeability of free space, io ~ 7 Re, 
and 6 = arctan(|_By|/_Bz). Geocentric solar magnetospheric 
(GSM) coordinates are used. 

The ACE spacecraft [Stone et al., 1998] orbits the Earth- 
Sun LI libration point approximately 1.5 x lO^m from the 
Earth and monitors solax wind, interplanetary magnetic 
fields and high energy particles. The data can be down- 
loaded from http://cdaweb.gsfc.nasa.gov. Specifically, 
for the years 2000-2007, we extracted the magnitude of the 
x-component of the solar wind, and the y and z components 
of the magnetics fields, as seen respectively by the SWEPAM 
and MAG instruments (level 2 72 data) of the ACE space- 
craft, all in GSM coordinates. The choice of components 
reflect the Poynting flux interpretation of the e parameter. 
For the most part, measurements are available every 64 and 
16s for the wind velocity and magnetic fields, respectively. 
We calculated the e parameter given by Eq. (15) every 64s. 
Since the wind velocity and magnetic field measurements 
are not synchronized, wc linearly interpolated the magnetic 
field measurements towards the time of the nearest wind 
velocity measurement. For these 8 years the e time series 
consisted of 3 944 700 points; see Fig. 15(a). Measurements 
for wind velocities or magnetic fields axe sometimes unavail- 
able. Approximately 9 per cent of the points comprising the 
e series are missing. As in [Moloney and Damdsen, 2011], 
we set missing data points to the value of the last valid 
recording proceeding them (irrespective of the size of the 
data gap), thereby creating plateaus of constant intensity. 
This minimises artifacts associated with points missing at 
regular experiment-specific frequencies. We have checked 
that nothing changes crucially in our statistical analyses by 
adopting other schemes. 

Watkins et al. [2005] have suggested to model the e pa- 
rameter by a fractional Levy motion as defined by Eq. (12), 
which implies that the series of changes in e, Are{tk) = 
e{tk + t) — e{tk) with tk = to + kr for k >0, should form a 
self-similar SaS process. For the e time series studied here 
and shown in Fig. 15(b) for r — 64s, Moloney and David- 
sen [2010] have found that the properties of self-similarity 
and Q-stability both roughly hold for 64s < r < 4h. Specif- 
ically, they found a = 1.55 and H = 0.40 indicating the 
presence of weak anti-persistence. Similar values have been 
observed for other e series (see [Moloney and Davidsen, 2010] 
and references therein) always indicting the presence of anti- 
persistence. Given these properties, one might ask to which 
extent the numerical results for time-dependent hazard as- 
sessment and records statistics of self-similar SqS processes 
presented in Section 5 apply to the Art series. This will be 
addressed in the following. 

6.1. Conditional extrema and hazard assessment 
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Fig. 16 shows that the median of the conditional dis- 
tribution of the block maxima, mcdR{rni)), increases mono- 
tonically and approximately as a power law with mo for 
r = 64s. As T is increased the power law's exponent de- 
creases, eventually leaving the power law regime, and ap- 
proaching the lid case (median independent of condition) in 
the limit of very large t. Note that the median values of the 
block maxima and their corresponding surrogates, obtained 
by shuffling the block maxima, naturally increase for larger 
block sizes since the fewer blocks arc more and more dom- 
inated by the largest extreme values in the time scries (cp. 
r — 8192s). Our findings indicate clustering of extreme val- 
ues for not too large t and, thus, persistence — in sharp con- 
trast to what is expected for anti-persistent self-similar SaS 
processes (see Figs. 11, 12). This contradiction provides 
clear evidence that only some properties of the A,-e series 
can be (roughly) characterized by a self-similar SaS process. 
In particular, the memory reflected in the Are is not reli- 
ably captured by a self-similar SaS process — potentially 
due to non-stationaritics. A similar conclusion has been 
reached by Moloney and Davidsen [2010] who investigated 
the distributions of block maxima of Ai-e. Nevertheless, the 
findings summarized by Fig. 16 show unambiguously that 
time-dependent haaaxd assessment can be successfully ap- 
plied to the considered time series and clearly outperforms 
any time-independent hazard assessment of future extreme 
values. 

6.2. Record Statistics 

The observed clustering in the Ai-e series discussed in 
section 6.1 is also present in the record statistics. Fig. 17 
shows that the probability P{Nr) to find Na records in a 
sequence of length R is rrmch higher for large Nr than what 
is expected for the iid case. This is further confirmed by 
Fig. 18 which shows that the expected number of records 
grows much faster than in the iid case. Both figures also sug- 
gest a dependence on r. In particular, the deviations from 
the iid case for large times increase monotonically with t 
until they reach a maximum around r = 1024s. As r is in- 
creased further the deviations decrease monotonically with 
T and finally they approach the iid case in the limit of very 
large r. This is supported by the probability to observe a 
record at an elapsed time t shown in Fig. 19. Note that in 
contrast to these results found for records the conditional 
maxima in Fig. 16 apparently showed a monotonic depen- 
dence on T. 

6.3. Discussion 

While we have focused here on A^e series, our results 
with respect to time-dependent hazard assessment are di- 
rectly applicable to the e series itself. As Fig. 20 shows, 
there is a strong correlation between AT-e(tfe) and e{tk + t) 
— as measured by the Pearson correlation coefficient — if 
one considers the largest values in AT-e(tfc). This implies 
that a large extreme value in A^e typically indicates a large 
value in e, while a smaller extreme value in Ai-e tends to 
correspond to a smaller value in e independent of the ex- 
act value of T. Thus, one can translate our findings from 
Section 6.1: Extreme values in e tend to be clustered, which 
makes time-dependent hazard assessment in the context of 
space weather feasible. These results are just a first step 
and clearly need to be investigated in more detail in future 
work. 

7. Summary 

In summary, our numerical analysis of self-similar sym- 
metric a-stable processes has shown that the presence of 



long-range memory can load to significant finite size effects 
in extreme value and record statistics despite the fact that 
some of the asymptotic behavior is not affected. The fi- 
nite size effects are particularly pronounced if the long-range 
memory leads to persistent behavior. Wc also found that 
long-range memory allows a time-dependent hazard assess- 
ment of the size of future extreme events based on extreme 
events observed in the past. Time-dependent hazard assess- 
ment based on the value of the previous block maximum sig- 
nificantly outperforms timc-indcpondont hazard assessment 
and, thus, provides a significant improvement. 

Moreover, we showed that such a time-dependent haz- 
ard assessment is directly applicable in the context of the 
solar power influx into the magnetosphere. Since many 
processes in nature are characterized by long-range mem- 
ory and heavy-tailed distributions — especially in space 
physics [Watkins et al., 2005; Moloney and Davidsen, 2010] 
— our flndings are of general interest. Further examples 
outside geophysics include communications traffic [Laskin 
et at., 2002]. 

Appendix 
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Notes 

1. Since min{Xi , . . . , X„} = max{— Xi, . . . , — Xn}, the mini- 
mum can be mathematically treated in the same way. 

2. Similarly, one can define a lower record Xm = min{Xi , . . . , Xm}- 

3. This limitation mainly comes from the loss of validity of the 
Minkowski inequality for a < 1, which is used in [Chechkin 
and Gonchar, 2000] to estimate the upper boundary of H. 
It should nevertheless be possible to generalize the algorithm 
to arbitrary a £ (0, 2] . Moreover, one possible pitfall of the 
algorithm is that the probability density of the generated self- 
similar SqS process is not explicitly renormalized to be the 
same as for the uncorrelated noise it is initialized with. This 
has to be done manually. 

4. A similar supposedly faster algorithm was suggested later [ Wu 
et al, 2004], but not tested by us. 

5. This crossover is not present in the algorithm by Chechkin and 
Gonchar [2000] 

6. We also noted a number of artifacts in the generated time se- 
ries for values a < 1 which we do not consider in this paper. 
One might be able to resolve these artifacts by significantly 
increasing the mesh size parameter m. However, doubling m 
results in practically doubling the memory requirements unless 
the series lenght n is reduced accordingly. We plan on testing 
this more systematically in the future. 

7. Note further that the values were chosen to ensure that 
m{M -(- n) is an integer power of 2. 

8. Note that significant computational resources are required. For 
a single parameter set (a, H), about 15GB axe necessary to 
store the 'double' raw data. Since the computation of the 
involved Fourier transform requires to allocate arrays of size 
m(M + n) and there is a noticeable overhead for computation, 
main memory requirements become an issue very quickly as m 
and M grow. 

9. Uncorrelated SaS distributed random numbers can be gener- 
ated from uncorrelated Gaussian distributed random numbers 
following [Chambers et al, 1976, 1987]. 

lOJVote that the differences between the different surrogate data 
are partly statistical and partly due to the algorithm discussed 
in section 4.2 

Il.This corresponds to the upper 0.999-quantile, or 1 per mille of 
all maxima. 
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Table 1. Shape parameters ^ and corresponding maximum 
error obtained from 95%-coiifideiice intervals of a maximum 
likelihood fit of Eq. (4) to the data. Note that for R < 100 
there was often no convergence in the estimation meaning the 
probability density cannot be properly approximated by the 
GEV distribution in Eq. (4). Positive values of ^ indicate a 
Frccliet distribution. Asymptotically, should approach l/a 
based on theoretical results [Samorodnitsky, 2004]. 





a = 1.5 


a = 1.8 




self-similar SaS 
shape ^ 


shuffled SaS 
shape ^ 


self-similar SaS 
shape ^ 


shuffled SaS 
shape ^ 


H=0.1 

R=1000 

R=10000 


0.6679 ± 0.0017 
0.6582 ± 0.0054 


0.6657 ± 0.0017 
0.6532 ± 0.0054 


0.5751 ± 0.0017 
0.5489 ± 0.0051 


0.5734 ± 0.0017 
0.5448 ± 0.0050 


H=0.2 

R=1000 

R=10000 


0.6682 ±0.0017 
0.6575 ± 0.0054 


0.6653 ± 0.0017 
0.6510 ± 0.0054 


0.5756 ±0.0017 
0.5491 ± 0.0051 


0.5738 ± 0.0017 
0.5443 ± 0.0050 


H=0.3 

R=1000 
R=10000 


0.6682 ± 0.0017 
0.6573 ± 0.0054 


0.6647 ±0.0017 
0.6509 ± 0.0054 


0.5763 ±0.0017 
0.5498 ± 0.0051 


0.5738 ± 0.0017 
0.5451 ± 0.0051 


H=0.4 

R=1000 

R=10000 


0.6683 ± 0.0017 
0.6584 ± 0.0054 


0.6649 ± 0.0017 
0.6501 ± 0.0054 


0.5764 ± 0.0017 
0.5501 ± 0.0051 


0.5739 ± 0.0017 
0.5473 ± 0.0051 


H=0.5 

R=1000 

R=10000 


0.6685 ± 0.0017 
0.6589 ± 0.0054 


0.6654 ± 0.0017 
0.6513 ± 0.0054 


0.5761 ± 0.0017 
0.5515 ±0.0051 


0.5756 ± 0.0016 
0.5496 ± 0.0051 


H=0.6 

R=1000 

R=10000 


0.6684 ±0.0017 
0.6608 ± 0.0054 


0.6667 ±0.0017 
0.6556 ± 0.0054 


0.5751 ± 0.0017 
0.5533 ±0.0051 


0.5756 ± 0.0017 
0.5488 ± 0.0051 


H=0.7 

R=1000 

R=10000 


0.6681 ± 0.0017 
0.6625 ± 0.0054 


0.6675 ± 0.0017 
0.6568 ± 0.0054 


0.5742 ± 0.0016 
0.5530 ± 0.0051 


0.5714 ± 0.0017 
0.5418 ± 0.0051 


H=0.8 

R=1000 

R=10000 


0.6656 ± 0.0017 
0.6622 ± 0.0054 


0.6586 ± 0.0017 
0.6431 ± 0.0054 


0.5496 ± 0.0011 
0.5519 ±0.0051 


0.5653 ± 0.0017 
0.5296 ± 0.0051 


H=0.9 

R=1000 

R=10000 


0.4199 ± 0.0017 
0.6613 ± 0.0054 


0.6410 ± 0.0006 
0.6202 ± 0.0054 


0.2716 ±0.0005 
0.5484 ± 0.0051 


0.5510 ± 0.0017 
0.5108 ± 0.0050 
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12.We start with a set of Arj,in = 20 non-overlapping loga- 
rithmic bins, {^k}k=l,...,Nkin = {[™fe,min,"ifc,max)}fc with 
^fc.min < "ifc,max and mi^min > 0, and associate the geomet- 
ric mean rukfi = (jrik., 

with the condition rukfi- 
Due to the heavy tails in the Prechet distribution, outer bins 
are much less populated than bins in the center. We therefore 
combine bins on both edges, starting from fe = 1 and k = Arj,in 
and proceeding towards the center, until at least 1000 maxima 
mj are contributing to each bin. 
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Figure 1. Illustration on the definition of records in a 
realization {xi}i=i^,.,^R of the process {Xi}i=i,,.,._R. Four 
records (black), ri = xt^^i, r2 — Xt2=4, rs — xt^^e, 
r4 ~ Xt^^io, are identified in the shown fragment. 
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Figure 4. Same as Fig. 3 but for a = 1.8. Note that 
now the independent case corresponds to 7? = 1/q = | 
and we have anti-persistence for H < | and persistence 
for // > |. Thus, the persistence for H = 0.9 is stronger 
and the anti-persistence for H = 0.1 is weaker than in 
the case with a = 1.5 shown in Fig. 3. 



Figure 2. Definition of a sequence of block max- 



ima nij — ma,x{xjii, 
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tially shown in (b)] by considering disjunct blocks of size 
R = 100 in the underlying time series {2;i}i=o,i,...,iv-i 
[partially shown in (a)]. 
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Figure 5. Exceedance probability, i. e., the probability 
to find an extreme event larger than m. Shown are results 
for self-similar SaS time series with H = 0.9 and the cor- 
responding independent surrogates (black solid lines) for 
different block sizes R (symbols) and for different char- 
acteristic exponents (a) a — 1.5 and (b) a = 1.8. 



Figure 3. Probability density functions of block max- 
ima for a — 1.5 and different values of H and R. (a) 
and (b) are examples of anti-persistence while (c) and 
(d) correspond to examples of persistent behavior since 
H = 1/a = 2/3 is the boundary between the two regimes. 
Shown are four different block sizes i? = 10 [stars, light 
gray], R — 100 [diamonds, gray], R — 1000 [trian- 
gles, dark gray], and R — 10000 [squares, black]. The 
surrogate data for the corresponding independent SaS 
processes are indicated by solid black lines. Maximum- 
likelihood GEV fits are plotted as dotted curves (surro- 
gate data) and dashed curves (original data) in the same 
gray tones for comparison. 
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Figure 6. Convergence of the shape parameter to- 
wards the theoretical value 1/a as 7? — >■ oo obtained 
from GEV fits for four different self-similar SaS time 
series [gray-scale coded dash-dotted lines and symbols] 
and corresponding independent surrogates [solid lines in 
same gray] — H=OT [light gray, stars], H=0.2 [gray, dia- 
monds], H=0.8 [dark gray, triangles], and H=0.9 [black, 
squares]. Upper curves correspond to a = 1.5 while lower 
curves belong to a = 1.8. Error bars are 95%-confidence 
intervals of obtained from the GEV fit after combining 
the data of all 1500 configurations. 
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Figure 8. Rescaled versions of probability density func- 
tions shown in Fig. 3; the same gray tones and symbols 
are used. The expected asymptotic scaling of the tails is 
indicated by the dark gray dashed lines (a' = — 1 — a = 
-2.5). 
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Figure 9. Same as Fig. 8 but for a = 1.8. 



1000 

R 



10000 100000 



Figure 7. Dependence of (a) the mean (m) r and (b) the 
median medR block maximum together with (c) the scale 
parameter from a GEV fit on the block size R. Shown 
are results for four different self-similar SaS time series 
[gray-scale coded symbols] and corresponding indepen- 
dent surrogates [solid lines in same gray tones] — H=0.1 
[light gray, stars], H=0.2 [gray, diamonds], H=0.8 [dark 
gray, triangles], and H=0.9 [black, squares]. Upper curves 
belong to a = 1.5 and lower curves to a = 1.8. Shown 
are averages of (m)_R and med_R obtained separately for 
each configuration (A'^cont = 1500) together with the en- 
semble standard deviations as error bars in (a,b). Ex- 
pected rescaling exponents are indicated by dashed and 
dash-dotted lines, respectively. Note that the y-axis was 
rescaled by _R°'® to enhance differences. Error bars in (c) 
are 95%-confidence intervals of an obtained from a GEV 
fit after combining all TVconf configurations to increase 
statistics for large R. 
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Figure 10. Examples of the conditional probability den- 
sity functions for block maxima (a,b) and corresponding 
exceedances (c,d) for R = 100. Left panels (a,c) refer to 
a = 1.5 and right panels (b,d) to a = 1.8. Shown are 
results for block maxima m that are preceded by a block 
maximum mo > mthrcshow for a (large) self-similarity in- 
dex H = 0.9 [black circles] and for the corresponding 
randomized sequence of block maxima [gray triangles]. 
For comparison results for the complementary condition 
mo < mthrcshoid are also displayed (self-similar SaS pro- 
cesses [dark gray squares], independent surrogate [light 
gray diamonds]). The threshold has been chosen to en- 
sure 15000 conditional maxima [(a,c) mthrcshoid ~ 473, 

(b,d) mthreshold ~ 102]. 



Figure 11. Median of the conditional distribution of 
block maxima, medij(mo), given a preceding block max- 
imum of mo for a = 1.5. Results for different block sizes 
(a) R = 10, (b) R = 100, (c) R = 1000, (d) R = 10000 
are shown. Different self-similarity exponents are both 
gray tone- and symbol coded; H = 0.1 [very light gray, 
stars], H = 0.5 [light gray, squares], H = 0.6 [gray, open 
diamonds], H = 0.7 [dark gray, filled diamonds], H = 0.8 
[very dark gray, filled lower half circles], H = 0.9 [black, 
circles with plus]. The values expected in the absence of 
memory effects between block maxima are shown in the 
same gray-scale coding but with smaller open circles (see 
text for details). Note that both corresponding curves 
for data with and without memory must intersect. Error 
bars are obtained from bootstrapping [Efron, 1979; Efron 
and Tibshirani, 1993] within each bin: We draw the same 
number of elements as in the bin with a cap at 50000 and 
consider 10000 realizations per bin. The insets in (a,b) 
show magnifications of the intersection area. 




Figure 12. Same as Fig. 11 but for a characteristic ex- 
ponent a = 1.8. The main panels have the same x-axes 
as in Fig. 11 to allow a better comparison. 
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Figure 13. Probability of finding a given total number 
of records in sequences of block size R — 10000 for self- 
similar SaS processes with different parameters a and H 
(symbols). Results for corresponding surrogate data are 
shown for comparison (lines). 
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Figure 14. (a,c) Probability to break a record at a 
time t for different values of H and a. Black solid lines 
correspond to independent surrogate data (H = 1/a) 
which don't deviate significantly from the theoretically 
expected P(Jft = max{xi, a;t}) = 1/t — see Eq. (5). 
(b,d) Expected total number of records E{Nt) that occur 
up to time t in the same color- and symbol coding as in 
(a,c). The theoretical iid behavior is E{Nt) — ln\t\ -I- 7; 
see Eq. (6a) . Significant deviations from the iid behavior 
are visible in the presence of strong persistence (H — 0.9, 
Q = 1.8). 



Figure 16. Median of the conditional distribution of 
block maxima medn(mo) given a preceding block max- 
imum of mo within a logarithmic bin for the A^e se- 
ries with different values of r [symbols and line style 
coded]. The block size is i? = 100. Surrogate data (ob- 
tained by shuffling the A^e = 64s series' block maxima) 
corresponding to the iid case are shown for comparison 
[solid curve, circles]. Surrogates for other values of r (not 
shown) are similar but increase monotonically with r. 
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Figure 17. Probability density function for the number 
of records in non-overlapping blocks of size R — 1000 of 
the At-e series for different values of r and for surrogate 
data (obtained by shuffling the A^e series) corresponding 
to the iid case are shown for comparison. 



Figure 15. (a) Akasofu's e{t) time series derived from 
ACE data according to Eq. (15) for the years 2000 to 
2007 with a sampling of 64s. (b) AT=64se(tfc) time series 
obeying a symmetric heavy tailed distribution as shown 
by Moloney and Davidsen [2010]. 
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Figure 18. Expected number of records as a function 
of values k for the Art series for different values of r. 
Estimates are based on non-overlapping blocks of size 
R = 1000. Surrogate data (obtained by shuffling the 
A^e series) corresponding to the iid case are shown for 
comparison. 



Figure 19. Probability to observe a record at values k 
for the At-e series for different values of r. Estimates are 
based on non-overlapping blocks of size R = 1000. Sur- 
rogate data (obtained by shuffling the A^e series) corre- 
sponding to the iid case are shown for comparison. 
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Figure 20. (a) Probability density function to observe 
a given pair of values e{tk + t) and AT-e(tfc) — e{tk + 
t) — e{tk) simultaneously for a fixed value of r = 256s 
and only for pairs where ATe{tk) > 0; logarithmic bins 
were used. The dashed lines indicate from top to bottom 
the 99.9% [black, diamonds], 99.5% [gray, triangles], and 
the 99% [light gray, squares] quantile of the A256se dis- 
tribution, (b) Pearson's correlation coefficient calculated 
for the same upper quantiles as in (a) [same symbols and 
gray tones] for all time series pairs (^Are{tk),e{tk + r)) as 
a function of the time shift r. 



